University of Copenhagen Algorithms for Protein Structure Prediction
Abstract
The problem of predicting the three-dimensional structure of a protein from its amino acid sequence is one of the most important open problems in bioinformatics. One of the carbon atoms in each amino acid is the Cα atom, and the overall structure of a protein is often represented by a so-called Cα-trace. Here we present three different approaches for reconstructing Cα-traces from predictable measures.
In our first approach [63, 62], the Cα-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum-energy structures. The energy function is based on half-sphere-exposure (HSE) and contact number (CN) measures only. We show that the HSE measure is much more information-rich than CN; nevertheless, HSE does not appear to provide enough information to reconstruct the Cα-traces of real-sized proteins. Our experiments also show that tabu search (with our novel tabu definition) is more robust than standard Monte Carlo search.
In the second approach, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower bounds for partial structures very quickly. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm is competitive in quality and speed with other state-of-the-art decoy generation algorithms.
Our third Cα-trace reconstruction approach is based on bee-colony optimization [24]. We demonstrate why this algorithm has some important properties that make it suitable for protein structure prediction.
Our approach to model quality assessment (MQA) [64] makes use of distance constraints extracted from alignments to templates. We show how to use CN probabilities in an optimization algorithm for selecting good distance constraints, and we introduce the concept of non-contacts. When compared with state-of-the-art MQA algorithms on the CASP7 benchmark, our algorithm is among the top-ranked algorithms. We are currently participating in CASP8 MQA with this algorithm.
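As a rough illustration of the measures named in the abstract, the sketch below computes CN, an HSE-style up/down count, and the radius of gyration for a Cα-trace given as a list of 3D points. The details are assumptions, not taken from the work itself: the function names are hypothetical, the 13.0 sphere radius is only a placeholder default, and the side-chain direction is approximated by the unnormalised sum of the two vectors pointing from the neighbouring Cα atoms to the residue.

```python
import math

def contact_numbers(trace, radius=13.0):
    """CN per residue: number of other C-alpha atoms within `radius` (assumed cutoff)."""
    return [
        sum(1 for j, b in enumerate(trace) if j != i and math.dist(a, b) <= radius)
        for i, a in enumerate(trace)
    ]

def radius_of_gyration(trace):
    """Root-mean-square distance of the C-alpha atoms from their centroid."""
    n = len(trace)
    centroid = tuple(sum(p[k] for p in trace) / n for k in range(3))
    return math.sqrt(sum(math.dist(p, centroid) ** 2 for p in trace) / n)

def half_sphere_exposure(trace, radius=13.0):
    """(up, down) counts per interior residue; terminal residues get None.

    Assumption: the side-chain direction at residue i is approximated by
    (cur - prev) + (cur - next), computed from the C-alpha trace alone.
    """
    result = [None] * len(trace)
    for i in range(1, len(trace) - 1):
        prev, cur, nxt = trace[i - 1], trace[i], trace[i + 1]
        direction = tuple((cur[k] - prev[k]) + (cur[k] - nxt[k]) for k in range(3))
        up = down = 0
        for j, other in enumerate(trace):
            if j == i or math.dist(cur, other) > radius:
                continue
            offset = tuple(other[k] - cur[k] for k in range(3))
            # Neighbours on the side the direction vector points to count as "up".
            if sum(direction[k] * offset[k] for k in range(3)) >= 0:
                up += 1
            else:
                down += 1
        result[i] = (up, down)
    return result

# Toy four-residue trace on a unit lattice, with a small radius to match its scale.
trace = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)]
cn = contact_numbers(trace, radius=1.5)
hse = half_sphere_exposure(trace, radius=1.5)
rg = radius_of_gyration(trace)
```

For every interior residue the up and down counts add up to its CN, which is one way to see that HSE splits the same neighbourhood in two and therefore carries at least as much information as CN.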
Similar resources
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of the protein localization is among the most important issues in the bioinformatics that is used for the prediction of the proteins in the cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...
On the reference ratio method and its application to statistical protein structure prediction
Thomas Hamelryck, John Haslett, Kanti Mardia, John T. Kent, Jan Valentin, Jes Frellsen, Jesper Ferkinghoff-Borg The Bioinformatics center, University of Copenhagen, Denmark School of Computer Science and Statistics,Trinity College, Dublin Department of Statistics, School of Mathematics, The University of Leeds, UK Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby,...
Physicochemical Position-Dependent Properties in the Protein Secondary Structures
Background: Establishing theories for designing arbitrary protein structures is complicated and depends on understanding the principles for protein folding, which is affected by applied features. Computer algorithms can reach high precision and stability in computationally designing enzymes and binders by applying informative features obtained from natural structures. Methods: In this study, a ...
Prediction of Breast Tumor Malignancy Using Neural Network and Whale Optimization Algorithms (WOA)
Introduction: Breast cancer is the most prevalent cause of cancer mortality among women. Early diagnosis of breast cancer gives patients greater survival time. The present study aims to provide an algorithm for more accurate prediction and more effective decision-making in the treatment of patients with breast cancer. Methods: The present study was applied, descriptive-analytical, based on the ...
In silico investigation of lactoferrin protein characterizations for the prediction of anti-microbial properties
Lactoferrin (Lf) is an iron-binding multi-functional glycoprotein which has numerous physiological functions such as iron transportation, anti-microbial activity and immune response. In this study, different in silico approaches were exploited to investigate Lf protein properties in a number of mammalian species. Results showed that the iron-binding site, DNA and RNA-binding sites, signal pepti...
Publication date: 2009